
    Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework

    While many existing formal concept analysis algorithms are efficient, they are typically unsuitable for distributed implementation. Taking the MapReduce (MR) framework as our inspiration, we introduce a distributed approach for performing formal concept mining. The novelty of our method is that it uses a lightweight MapReduce runtime called Twister, which is better suited to iterative algorithms than recent distributed approaches. First, we describe the theoretical foundations underpinning our distributed formal concept analysis approach. Second, we provide a representative exemplar of how a classic centralized algorithm can be implemented in a distributed fashion using our methodology: we modify Ganter's classic algorithm by introducing a family of MR* algorithms, namely MRGanter and MRGanter+, where the prefix denotes the algorithm's lineage. To evaluate the factors that impact distributed algorithm performance, we compare our MR* algorithms with the state of the art. Experiments conducted on real datasets demonstrate that MRGanter+ is efficient and scalable, making it an appealing algorithm for distributed problems.
    Comment: 17 pages, ICFCA 201, Formal Concept Analysis 201
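For orientation, the centralized starting point mentioned in this abstract, Ganter's algorithm (commonly called NextClosure), can be sketched as below. This is a minimal single-machine illustration, not MRGanter or MRGanter+, and it does not use Twister. The function names and the representation of a context as a list of attribute sets are assumptions made for this example.

```python
def closure(attrs, context, n_attrs):
    """Return the closure of attrs: the attributes shared by every object
    that has all attributes in attrs. `context` is a list of attribute sets,
    one per object (an assumed representation for this sketch)."""
    result = set(range(n_attrs))
    for row in context:
        if attrs <= row:
            result &= row
    return frozenset(result)


def next_closure(current, context, n_attrs):
    """Return the closed set that follows `current` in lectic order,
    or None if `current` is the last one (the full attribute set)."""
    for i in reversed(range(n_attrs)):
        if i in current:
            continue
        # Keep the attributes of `current` below i, add i, and close.
        candidate = closure({a for a in current if a < i} | {i}, context, n_attrs)
        # Accept only if closing introduced no new attribute smaller than i.
        if all(a in current for a in candidate if a < i):
            return candidate
    return None


def all_intents(context, n_attrs):
    """Enumerate every concept intent exactly once, in lectic order."""
    intent = closure(frozenset(), context, n_attrs)
    intents = []
    while intent is not None:
        intents.append(intent)
        intent = next_closure(intent, context, n_attrs)
    return intents
```

Each call to `next_closure` depends on the previously found intent, which is what makes the centralized algorithm inherently iterative.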

    Formal Concept Analysis via Atomic Priming

    Formal Concept Analysis (FCA) looks to decompose an object-attribute matrix into a set of sparse matrices capturing the underlying structure of a formal context. We propose a Rank Reduction (RR) method to prime approximate FCAs, namely RRFCA. While many existing FCA algorithms are complete, lectic ordering of the lattice may not minimize search/decomposition time. Initially, RRFCA decompositions are neither unique nor complete; however, a set of good closures with high support is learned quickly and then made complete. The novelty of RRFCA is a new multiplicative two-stage method. First, we describe the theoretical foundations underpinning our RR approach. Second, we provide a representative exemplar showing how RRFCA can be implemented. Further experiments demonstrate that RRFCA methods are efficient and scalable and that they yield time savings. We also demonstrate that the resulting methods lend themselves to parallelization.
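The decomposition view in the first sentence of this abstract can be illustrated with a small sketch: each formal concept corresponds to an all-ones rectangle (extent by intent) in the binary matrix, and the Boolean sum of the concepts generated by the individual objects reproduces the matrix. This is not the RRFCA method, whose details the abstract does not give. The function names and the list-of-lists matrix representation are assumptions made for this example.

```python
def object_concepts(matrix):
    """For each object (row), build the concept it generates:
    intent = the row's attributes, extent = all rows containing that intent."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    concepts = set()
    for row in matrix:
        intent = frozenset(j for j in range(n_cols) if row[j])
        extent = frozenset(
            i for i in range(n_rows) if all(matrix[i][j] for j in intent)
        )
        concepts.add((extent, intent))
    return concepts


def boolean_sum(concepts, n_rows, n_cols):
    """OR together the rectangles (extent x intent) of the given concepts."""
    out = [[0] * n_cols for _ in range(n_rows)]
    for extent, intent in concepts:
        for i in extent:
            for j in intent:
                out[i][j] = 1
    return out
```

For a matrix such as `[[1, 1, 0], [0, 1, 1], [0, 1, 0]]`, calling `boolean_sum(object_concepts(m), 3, 3)` returns the original matrix, since every 1 entry lies in the rectangle generated by its own row and no rectangle covers a 0 entry.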